An Exact Small Sample Test of Non-Nested Hypotheses

Summary 

Classical  hypothesis  testing  typically  validates  or  invalidates  a 
model  by  nesting  it  in  a  more  general  model  and  performing  a  likelihood 
ratio  test.   When  models  are  not  nested  there  does  not  seem  to  exist  a 
practical  way  of  comparing  them.   Generalizing  the  classical  statistical 
methodology  of  the  likelihood  ratio,  we  develop  a  practical,  exact,  small 
sample  test  of  non-nested  hypotheses.   The  test  may  be  useful  because  it 
can  allow  one  to  check  the  validity  of  a  specification  directly.   As  an 
application,  we  find  that  time  series  data  may  not  support  the  existence 
of  an  aggregate  production  function  for  the  U.S.  economy. 
Introduction

All  of  us  have  been  exposed  to  situations  where  an  elaborate  econometric 
model,  supposedly  based  on  theory,  hardly  fits  the  data  better  than  some 
simple  minded  ad  hoc  specification.   Given  any  model,  if  there  are 
easily  available  alternative  ways  of  organizing  data  which  yield  results 
"too  good  to  be  true"  in  some  sense,  that  should  cast  doubt  on  the 
authenticity  of  the  original  specification.   The  discrediting  alternative 
need  not  be  theoretically  justified;  it  need  not  pretend  to  any  theoretical 
content  whatsoever.   Intuitively,  there  ought  to  be  an  exact  sense  in 
which  something  is  suspicious  about  a  model  that  is  supposed  to  be  correct, 
when  casually  chosen  alternatives  also  fit  so  well. 

This  approach  both  implies  and  is  implied  by  a  particular  philosophy 
of  modeling  which  can  be  simply  stated  as  follows:   should  we  find  that  a 
particular  model  implies  unlikely  consequences,  it  should  lead  us  to  question 
or  even  perhaps  to  reject  the  underlying  specification. 

The  fundamental  barrier  to  formalizing  such  ideas  has  been  the  lack 
of  an  operational  test  for  non-nested  hypotheses.   A  procedure  like  the 
t-test,  for  example,  automatically  screens  out  such  foolish  behavior  as 
accepting a linear model with a zero coefficient when an obvious
alternative fits the data much better. But the inability to compare non-nested
hypotheses  is  a  very  severe  limitation  which  prevents  us  from  inferring 
that  a  model  as  a  whole  may  be  incorrectly  specified  when  some  other 
model  is  fitting  too  well  relative  to  it.   Nor  does  the  trick  of 
artificially nesting two equally well fitting alternatives in a more
general model help much in practice, because multicollinearity will almost
surely  prevent  either  alternative  from  rejecting  the  other. 

In  this  paper  we  offer  an  exact  small  sample  test  of  non-nested 
models  which  directly  generalizes  the  classical  F-test.   In  most  cases 
there  should  be  no  difficulty  in  performing  our  test  routinely  on  the 
computer,  although  it  is  definitely  a  more  demanding  computation  than  the 
classical  tests  to  which  it  reduces  in  a  nested  environment. 

The  test  statistic  we  use  is  the  likelihood  ratio  proposed  by  Cox 
[1961],  [1962],  and  subsequently  applied  by  Pesaran  [1974].   Our  own 
philosophy  of  modeling  closely  parallels  the  exposition  of  Pesaran  and 
Deaton [1978], q.v., except that we would go further in asserting the
"other" hypothesis can be used to reject a maintained hypothesis even
without any pretence whatsoever about the viability of the rejection model
as an alternative.

The  main  contribution  of  the  present  paper  consists  of  showing  that 
there  is  essentially  no  computational  barrier  to  getting  as  close  as  we 
want to the exact distribution of the likelihood ratio statistic for
non-nested linear models. By looking at the problem slightly unconventionally,
it is possible to obtain an exact, small sample test which avoids altogether
the pitfalls of asymptotic large sample theory.

Our  feeling  is  that  the  test  we  propose  is  powerful  and  is  likely 
to  prove  damaging  to  certain  classes  of  models.   As  an  example,  we  show 
that  the  existence  of  a  stable  aggregate  production  function  for  the  U.S. 
economy  may  be  questionable  because  a  general  formulation  with  capital 
and labor explains the growth of output significantly worse (or
insignificantly  better)   than  a  specification  which  is  identical  except 
for  omitting  capital  and  labor  altogether. 

Note  that  in  our  framework  it  does  not  make  sense  to  talk  about 
proving  a  model  is  "true".   A  model  can  never  be  proved  true;  it  can 
only  be  discredited,  which  occurs  if  it  is  found  to  result  in  unlikely 
coincidences. A model remains a viable hypothesis only under
circumstances in which other data or other models do not yield an implausibly
good  fit,  considering  the  original  model  is  supposed  to  be  correct. 
Models  are  viable,  as  it  were,  only  by  default.   Any  given  field  may 
or  may  not  be  characterized  by  a  viable  model,  and  the  situation  can 
change  over  time.   In  our  view  this  is  a  correct  description  of  how 
science  works.   It  is  quite  possible  to  be  in  an  agnostic  situation, 
temporarily  or  permanently,  where  no  single  hypothesis  is  able  to  establish 
itself against refutation. If a model has established itself (by
default) as viable, it may at any time be discredited by newly conceived
specifications  or  changed  data.   And  the  discrediting  model  does  not  have 
to  be  viewed  as  a  candidate  to  replace  the  discredited  model,  since  it 
too  may  be  non-viable. 




Formal  Description  of  the  Test 

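The model being tested, called the "hypothesis model," is denoted (H).
Another model, denoted (R), may be used to possibly reject (H). It is
assumed that the hypothesis and rejection models satisfy:

$$\text{(H)} \quad \underset{(T \times 1)}{Y} = \underset{(T \times k_x)}{X}\,\underset{(k_x \times 1)}{\beta} + \underset{(T \times 1)}{\varepsilon}, \qquad X \text{ and } \varepsilon \text{ independent}, \quad \varepsilon_i \text{ iid } N(0, \sigma^2),$$

$$\text{(R)} \quad \underset{(T \times 1)}{f(Y)} = \underset{(T \times k_z)}{Z}\,\underset{(k_z \times 1)}{\gamma} + \underset{(T \times 1)}{\delta}, \qquad Z \text{ and } \varepsilon \text{ independent}.$$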

$X$ and $Z$ represent data that might coincide in part, or might be
completely different. $f(Y)$ denotes some transformation of $Y$, such as
$f_i(Y_i) = \log Y_i$, or $f(Y) = Y + V$ for some vector $V$. The most common
transformation used in practice would be the identity $f(Y) = Y$. The
rejection model (R) is not assumed to be a reasonable "alternate hypothesis"
to (H), or to satisfy any statistical properties beyond independence of
$Z$ and $\varepsilon$.

Suppose we consider performing OLS regressions on (H) and (R), and
using the ratio of the sums of squared residuals (SSR) as a test statistic
$\lambda$:

$$\mathrm{SSR}_H = (Y - X\hat{\beta})'(Y - X\hat{\beta}) = Y'M_xY,$$

$$\hat{\beta} = (X'X)^{-1}X'Y, \qquad M_x = I - X(X'X)^{-1}X',$$

$$\mathrm{SSR}_R = (f(Y) - Z\hat{\gamma})'(f(Y) - Z\hat{\gamma}) = f(Y)'M_zf(Y),$$

$$\hat{\gamma} = (Z'Z)^{-1}Z'f(Y), \qquad M_z = I - Z(Z'Z)^{-1}Z',$$

and

$$\lambda = \frac{\mathrm{SSR}_R}{\mathrm{SSR}_H} = \frac{f(Y)'M_zf(Y)}{Y'M_xY}. \qquad (2)$$
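The statistic in (2) only needs two sets of OLS residuals. The following is a minimal sketch in pure standard-library Python, not the authors' program: the names `dot`, `orthonormal_basis`, `ssr` and `lambda_stat` are hypothetical, regressors are passed as lists of columns, and the residual-maker $M$ is applied through a Gram-Schmidt basis instead of an explicit matrix inverse. The toy data are invented for illustration only.

```python
import math

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def orthonormal_basis(columns):
    """Gram-Schmidt basis for the span of the regressor columns."""
    basis = []
    for col in columns:
        q = list(col)
        for b in basis:
            c = dot(q, b)
            q = [qi - c * bi for qi, bi in zip(q, b)]
        norm = math.sqrt(dot(q, q))
        if norm > 1e-12:  # assumption: drop (near-)collinear columns
            basis.append([qi / norm for qi in q])
    return basis

def ssr(v, columns):
    """Sum of squared OLS residuals v'Mv from regressing v on the columns."""
    r = list(v)
    for b in orthonormal_basis(columns):
        c = dot(r, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return dot(r, r)

def lambda_stat(y, x_cols, z_cols, f=lambda v: v):
    """Test statistic (2): SSR of (R) divided by SSR of (H)."""
    return ssr(f(y), z_cols) / ssr(y, x_cols)

# Toy data (hypothetical): (H) regresses y on a constant, (R) on a contrast z.
y = [1.0, 2.0, 4.0, 3.0]
const = [1.0, 1.0, 1.0, 1.0]
z = [1.0, 1.0, -1.0, -1.0]
print(lambda_stat(y, [const], [z]))
```

A low value of the statistic means the rejection model leaves much less unexplained variation than the hypothesis model.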


Under the assumption that (H) is the correct model, suppose we
could somehow obtain the distribution $G(\lambda)$ of our test statistic. Then,
following the classical likelihood-ratio approach, we could specify a
critical region $[0,c]$, and reject model (H) if $\lambda \in [0,c]$. Choosing
$c = \lambda$, we would always reject (H), and the probability that this decision
is wrong is given by

$$\alpha = \int_0^{\lambda} dG(\lambda).$$

That is, $\alpha$ is the probability of making a type I error, or the level of
significance of the test.

In this framework, we reject (H) at a high level of significance
(low $\alpha$) only when the test statistic $\lambda$ is much lower than we would expect
if (H) were true. That is, should we find that assuming (H) true leads
to an unlikely event (i.e., low $\lambda$), we conclude that the hypothesis model
cannot be the correct specification. Our test is thus a formalization
of what we mean by a model being "untrue", i.e., its acceptance leads
to unlikely consequences.
The distribution $G(\lambda)$ is derived as follows. Assuming that (H)
is true, we can substitute $Y = X\beta + \varepsilon$ into (2) to obtain

$$\lambda = \frac{f(X\beta + \varepsilon)'M_zf(X\beta + \varepsilon)}{\varepsilon'M_x\varepsilon}. \qquad (2')$$

Now suppose that $\beta$ did not appear in (2'), e.g., suppose we could
rewrite the expression so that $\beta$ cancels out (this actually occurs in
the classical case of nested hypotheses). Then one method of obtaining
the distribution of $\lambda$ is to simply simulate $\varepsilon$ using a random number
generator. Denoting simulated values with a tilde we would have

$$\tilde{\lambda} = \frac{f(X\beta + \tilde{\varepsilon})'M_zf(X\beta + \tilde{\varepsilon})}{\tilde{\varepsilon}'M_x\tilde{\varepsilon}}. \qquad (3)$$

Repeating this simulation many times, we would obtain a frequency
distribution $\tilde{G}$ which has been derived under the assumption that model (H)
is correct, so with enough simulations $\tilde{G}$ would approach the actual
distribution $G$. Our hypothesis test could then be performed.

Of course, the unknown coefficients $\beta$ need not cancel out of (2')
or (3), except in the special case of nested models. But under the
assumption that (H) is correct, $\hat{\beta}$ has a precise relation to $\beta$ and
$\varepsilon$, namely,

$$\hat{\beta} = (X'X)^{-1}X'Y = (X'X)^{-1}X'(X\beta + \varepsilon) = \beta + (X'X)^{-1}X'\varepsilon.$$

Using the assumption of "flat" priors on $\beta$ (which underlies classical
statistics), we can use Bayes' law to invert the above expression, obtaining
a posterior distribution of $\beta$ as a function of $\varepsilon$, conditional on the
observed $\hat{\beta}$:

$$\beta = \hat{\beta} - (X'X)^{-1}X'\varepsilon. \qquad (4)$$

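In our simulations, posterior values of $\beta$ are calculated from the
formula $\tilde{\beta} = \hat{\beta} - (X'X)^{-1}X'\tilde{\varepsilon}$. Substituting $\tilde{\beta}$ for $\beta$
in (3), we derive

$$\tilde{\lambda} = \frac{f(X\hat{\beta} + M_x\tilde{\varepsilon})'M_zf(X\hat{\beta} + M_x\tilde{\varepsilon})}{\tilde{\varepsilon}'M_x\tilde{\varepsilon}}. \qquad (3')$$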
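The substitution that leads to (3') can be written out in one short chain. The sketch below uses only the posterior formula for the simulated coefficients and the definition of the residual-maker matrix for (H); nothing beyond those two definitions is assumed.

```latex
\[
\begin{aligned}
X\tilde{\beta} + \tilde{\varepsilon}
  &= X\left[\hat{\beta} - (X'X)^{-1}X'\tilde{\varepsilon}\right] + \tilde{\varepsilon} \\
  &= X\hat{\beta} + \left[I - X(X'X)^{-1}X'\right]\tilde{\varepsilon} \\
  &= X\hat{\beta} + M_x\tilde{\varepsilon}.
\end{aligned}
\]
```

The fitted values and both residual-maker matrices are computed once from the data, so only the simulated error vector changes from one replication to the next.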

From (3') we see that each simulation merely requires the evaluation
of linear and quadratic forms in $\tilde{\varepsilon}$ with a fixed matrix and does not require
inverting a new matrix at each step. This crucial feature makes our exact,
small sample test an absolutely routine calculation. It is derived from two
underlying assumptions. First, the (H) and (R) models must be linear in
$\beta$ and $\gamma$. Second, $X$ and $Z$ must be independent of $\varepsilon$, since otherwise when
we simulate $\varepsilon$ we must also simulate $X$ and $Z$, and re-compute $M_x$ and $M_z$. Note,
however, that transformations of the form $f(Y)$ can easily be performed. Our
experience is that the test can be calculated in a routine fashion on the
computer and could be programmed as a standard part of a regression package.

Our method for obtaining $\tilde{G}$ is now complete except for one detail: in
order to generate $\tilde{\varepsilon}$ we must know its variance. The $\varepsilon_i$ are assumed to be
iid $N(0,\sigma^2)$, and $\sigma^2$ is estimated as

$$\hat{\sigma}^2 = \frac{Y'M_xY}{T - k_x}.$$
Under the assumption that (H) is the correct model, we have

$$\frac{(T - k_x)\hat{\sigma}^2}{\sigma^2} \sim \chi^2(T - k_x).$$

Consistent with classical statistics, uniform or "flat" priors on
$\log\sigma$ yield the posterior distribution

$$\sigma^2 = \frac{(T - k_x)\hat{\sigma}^2}{\chi^2}, \qquad \chi^2 \sim \chi^2(T - k_x). \qquad (5)$$

In practice, posterior values of $\sigma^2$ are calculated from the formula
$\tilde{\sigma}^2 = (T - k_x)\hat{\sigma}^2/\tilde{\chi}^2$, where $\tilde{\chi}^2$ is obtained from a random-number generator.

For each drawing of $\tilde{\chi}^2$ we generate a vector $\tilde{\varepsilon}$, and then compute $\tilde{\lambda}$
from (3'). Repeating this simulation many times we obtain the
distribution $\tilde{G}$, derived under the assumption that (H) is correct. The hypothesis
test described above can thus be routinely performed on the computer.
We are calculating the probability $\alpha$ that the observed test statistic
$\lambda$ should be as small as it is, given the maintained hypothesis (H). The
error in estimating $\alpha$ can easily be calculated from the binomial
distribution: with $N$ simulations, $\mathrm{var}(\hat{\alpha}) = \alpha(1-\alpha)/N$.
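The whole procedure can be sketched in a few dozen lines. The following is an illustrative standard-library Python sketch under stated assumptions, not the authors' original program: the function name `exact_test`, its arguments, and the toy data are hypothetical; the chi-square draw is taken from a gamma variate with shape $(T-k_x)/2$ and scale 2; and residuals are formed by Gram-Schmidt projection so that no matrix is inverted inside the loop.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormal_basis(columns):
    """Gram-Schmidt basis for the span of the regressor columns."""
    basis = []
    for col in columns:
        q = list(col)
        for b in basis:
            c = dot(q, b)
            q = [qi - c * bi for qi, bi in zip(q, b)]
        norm = math.sqrt(dot(q, q))
        if norm > 1e-12:  # assumption: drop (near-)collinear columns
            basis.append([qi / norm for qi in q])
    return basis

def residual(v, basis):
    """Apply the residual-maker M to v, given an orthonormal basis."""
    r = list(v)
    for b in basis:
        c = dot(r, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return r

def exact_test(y, x_cols, z_cols, f=lambda v: v, n_sims=1000, seed=0):
    """Return (observed lambda, simulated alpha, binomial standard error)."""
    rng = random.Random(seed)
    T = len(y)
    bx, bz = orthonormal_basis(x_cols), orthonormal_basis(z_cols)
    dof = T - len(bx)                                  # T - k_x
    resid_h = residual(y, bx)                          # M_x Y
    ssr_h = dot(resid_h, resid_h)                      # (T - k_x) * sigma_hat^2
    fitted = [yi - ri for yi, ri in zip(y, resid_h)]   # X beta_hat
    resid_r = residual(f(y), bz)
    lam_obs = dot(resid_r, resid_r) / ssr_h            # statistic (2)
    hits = 0
    for _ in range(n_sims):
        chi2 = rng.gammavariate(dof / 2.0, 2.0)        # chi-square(dof) draw
        sigma = math.sqrt(ssr_h / chi2)                # posterior draw of sigma
        eps = [rng.gauss(0.0, sigma) for _ in range(T)]
        m_eps = residual(eps, bx)                      # M_x eps
        y_sim = [a + b for a, b in zip(fitted, m_eps)] # X beta_hat + M_x eps
        r = residual(f(y_sim), bz)
        lam_sim = dot(r, r) / dot(m_eps, m_eps)        # statistic (3')
        if lam_sim <= lam_obs:
            hits += 1
    alpha = hits / n_sims
    std_err = math.sqrt(alpha * (1.0 - alpha) / n_sims)
    return lam_obs, alpha, std_err

# Toy data (hypothetical): (H) uses a constant and trend, (R) a constant and dummy.
y = [1.0, 2.5, 2.0, 4.5, 4.0, 6.5]
const = [1.0] * 6
trend = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
dummy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
print(exact_test(y, [const, trend], [const, dummy]))
```

A transformation such as a logarithm can be passed through `f`, with the caveat that simulated values of the dependent variable must then stay inside the domain of the transformation.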

Our procedure for testing (H) using (R) shows why the rejection
model itself need not be believable: at no point have we assumed that
(R) satisfies any desirable statistical properties, beyond independence
of $Z$ and $\varepsilon$. This feature is closely related to our methodological view
that models can be rejected, but not accepted (except by default). To
formally accept a model, we need to know the probability that this
decision is wrong (i.e., the type II error). In other words, we must know
the distribution of our test statistic $\lambda$ when (H) is not the correct
specification. But if (H) is not correct, then we do not know the "true"
specification: in some cases (R) might be a reasonable candidate, but
there are surely many other possibilities. In principle, one can imagine
assigning priors to all possible specifications, and then integrating over
the space of models to obtain the probability of a type II error. In
practice, such a computation could obviously not be performed. Note that
reversing the roles of (H) and (R) simply means that (H) could be used to
reject, but not accept, model (R).

While  we  have  used  some  elementary  Bayesian  methodology  in  deriving 
the distribution $G(\lambda)$, our procedure for testing non-nested hypotheses
is  really  no  more  Bayesian  than  any  other  aspect  of  classical  statistics. 
In  fact,  our  test  is  a  direct  generalization  of  the  classical,  nested 
likelihood-ratio  tests. 

The classical statistical approach to analyzing the regression model
$Y = X\beta + \varepsilon$ is consistent with the Bayesian approach of using a quadratic
loss function, and diffuse priors on $\beta$ and $\log\sigma^2$. For example, the
standard t-test for the hypothesis $\beta_1 = b_1$ is equivalent to obtaining
the posterior distributions (4) and (5), and then testing whether $b_1$ is
likely to have been drawn from the distribution of $\beta_1$. That is, we
exclude the upper and lower $100\alpha/2$ percentiles of $\beta_1$, and then observe
whether $b_1$ falls into the middle range. This approach is computationally
identical to the t-test. Observe, however, that our formulation does not
depend on a fortuitous cancelling out of $\beta$ in (2'), which may make it
a more appealing conceptual approach for understanding statistical
testing, even for nested hypotheses.

We note that for the special case of nested models, our procedure
reduces to the standard F or t tests. As an example, consider testing
the hypothesis that certain coefficients of a model equal zero. Then the
data matrix $X$ of (H) includes only a subset of the variables in $Z$. It
follows that $M_zX = 0$, and so with $f(Y) = Y$, (2') becomes

$$\lambda = \frac{\varepsilon'M_z\varepsilon}{\varepsilon'M_x\varepsilon}.$$

That is, the test statistic $\lambda$ is simply the SSR of the unrestricted
model divided by the SSR of the restricted model. A simple transformation
of $\lambda$, i.e.

$$\frac{(1-\lambda)/(k_z - k_x)}{\lambda/(T - k_z)},$$

is distributed as $F(k_z - k_x,\, T - k_z)$.
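To see that this transformation is the familiar F statistic, it helps to write it in terms of the two sums of squared residuals. The sketch below assumes only that the statistic is the unrestricted SSR (model (R)) divided by the restricted SSR (model (H)), as stated above, and multiplies numerator and denominator by the restricted SSR.

```latex
\[
\frac{(1-\lambda)/(k_z - k_x)}{\lambda/(T - k_z)}
  = \frac{\left(1 - \dfrac{\mathrm{SSR}_R}{\mathrm{SSR}_H}\right)\Big/(k_z - k_x)}
         {\dfrac{\mathrm{SSR}_R}{\mathrm{SSR}_H}\Big/(T - k_z)}
  = \frac{(\mathrm{SSR}_H - \mathrm{SSR}_R)/(k_z - k_x)}{\mathrm{SSR}_R/(T - k_z)}.
\]
```

The right-hand side is the textbook F ratio for testing that the extra regressors in the larger model have zero coefficients, so a low value of the test statistic corresponds to a large F.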

That  classical  nested  hypothesis  testing  with  its  well  known 
statistical  properties  is  a  special  case  of  our  generalization  suggests 
to  us  the  general  test  is  likely  to  be  "powerful"  in  some  sense.   We 
expect that to pass the $\lambda$ test, the hypothesis model in most cases has
to  perform  somewhat  better  than  a  casually  specified,  empirically 
oriented  rejection  model. 
More work is needed to determine the properties of our test
under general and specific circumstances. As one special case, consider
testing a model against itself (i.e., (H) = (R)). Substituting $f(Y) = Y$
and $Z = X$ into (3'), the simulated test statistic becomes

$$\tilde{\lambda} = \frac{\tilde{\varepsilon}'M_x\tilde{\varepsilon}}{\tilde{\varepsilon}'M_x\tilde{\varepsilon}} = 1.$$

The observed test statistic is also $\lambda = 1$. In other words, we are
observing an event which occurs with probability one, and so we certainly
cannot reject (H) ($\alpha = 1$). This suggests that testing a model using another
which is "similar" will not lead to a rejection. Intuitively, some sort
of orthogonality between $X$ and $Z$, along with a good fit for (R), should
lead to stronger tests (lower $\alpha$).

Finally,  note  that  while  we  are  allowed  to  choose  (R)  by  comparing 
the  X  and  Z  matrices,  we  definitely  cannot  choose  the  rejection  model  by 
first  observing  the  results  of  the  test,  and  then  searching  for  a  Z  with 
stronger  results:   the  latter  procedure  would  violate  the  assumption  of 
independence between $Z$ and $\varepsilon$.
An Application

Without  taking  the  example  too  seriously,  the  aggregate  production 
function  nicely  illustrates  some  of  the  points  we  are  trying  to  make. 

In the aggregate U.S. private economy for year $t$, let

$Y(t)$ = actual output

$\bar{Y}(t)$ = potential output

$E(t)$ = employment rate

$L(t)$ = labor

$K(t)$ = capital

Suppose we postulate an Okun's law type of relationship between
potential and actual output of the form

$$Y(t) = \bar{Y}(t)\,E(t)^{\gamma}. \qquad (6)$$

By this rule a one point increase in unemployment decreases output by
$\gamma$ percent.

If potential output is accurately described by a constant returns
to scale production function of the general form $F(K,L)$ with
Hicks-neutral technical change, then

$$\bar{Y}(t) = A(t)\,F(K(t),L(t)). \qquad (7)$$

In practice we have chosen $A(t)$ to be of the exponential form

$$A(t) = A e^{\lambda t} \qquad (8)$$

and verified that more general specifications like

$$A(t) = A e^{\lambda_1 t + \lambda_2 t^2} \qquad (9)$$

do not negate our conclusions.

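Logarithmically differentiating (6), (7), and (8), we can
derive the "growth equation"

$$g_y = \eta_k g_k + \eta_l g_l + \lambda + \gamma g_e. \qquad (10)$$

In formula (10),

$$g_y = \frac{\dot{Y}}{Y}, \qquad g_k = \frac{\dot{K}}{K}, \qquad g_l = \frac{\dot{L}}{L}, \qquad g_e = \frac{\dot{E}}{E},$$

and, writing $\bar{Y}$ for potential output,

$$\eta_k = \frac{\partial \bar{Y}}{\partial K}\,\frac{K}{\bar{Y}}, \qquad \eta_l = \frac{\partial \bar{Y}}{\partial L}\,\frac{L}{\bar{Y}}.$$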
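The growth equation follows from taking logarithms and differentiating with respect to time. The sketch below assumes the exponential form of technical change and writes the Okun exponent, the trend rate, and the output elasticities of capital and labor with the Greek letters gamma, lambda and eta; these symbol choices are a reconstruction and should be read as assumptions.

```latex
\[
\begin{aligned}
\log Y(t) &= \gamma \log E(t) + \log A + \lambda t + \log F(K(t), L(t)), \\
\frac{\dot{Y}}{Y} &= \gamma \frac{\dot{E}}{E} + \lambda
   + \frac{F_K K}{F}\,\frac{\dot{K}}{K} + \frac{F_L L}{F}\,\frac{\dot{L}}{L}
   = \eta_k g_k + \eta_l g_l + \lambda + \gamma g_e .
\end{aligned}
\]
```

Because potential output is the production function scaled by a factor that does not depend on capital or labor, the elasticities of potential output with respect to each input equal the corresponding elasticities of the production function itself.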
Under the assumption of constant returns to scale and a marginal
productivity theory of value, $\eta_k$ and $\eta_l$ can be obtained as empirical
factor shares. Knowing growth rates of capital and labor, we can then
calculate the factor growth contribution term ($\eta_k g_k + \eta_l g_l$) for each
year.
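As a small illustration of this calculation, the sketch below computes the factor growth contribution year by year. The function name and the numbers are hypothetical and are not taken from the data used in the paper; the only assumption built in is that, under constant returns and competitive shares, the labor share is one minus the capital share.

```python
def factor_growth_contribution(share_k, g_k, g_l):
    """Share-weighted input growth, with the labor share set to 1 - share_k."""
    share_l = 1.0 - share_k
    return share_k * g_k + share_l * g_l

# Hypothetical illustrative values, not the paper's series.
years = [
    {"share_k": 0.40, "g_k": 0.030, "g_l": 0.010},
    {"share_k": 0.38, "g_k": 0.025, "g_l": 0.015},
]
contributions = [factor_growth_contribution(**row) for row in years]
print(contributions)
```

Subtracting this term from the growth rate of output gives the dependent variable that the hypothesis model explains with a constant and the growth of the employment rate.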

Now  a  legitimate  question  to  ask  is  whether  the  factor  growth 
contribution  to  equation  (10)  is  "doing  anything"  more  to  explain  the 
growth  of  output  over  and  above  what  is  explained  by  the  obvious  base 
case  of  a  constant  long-term  trend  modified  by  a  short-term  correction  for 
economic  activity.   Is  the  sophisticated  production  function  specification 
of  potential  output  (7)  statistically  superior  to  the  simple  constant 
growth  alternative 

$$\bar{Y}(t) = B e^{\lambda t} \qquad (11)$$

which  is  frankly  empirical? 

In the language of this paper, if the true specification is

$$g_y = \eta_k g_k + \eta_l g_l + \lambda + \gamma g_e + \varepsilon \qquad (12)$$

where $\varepsilon$ is i.i.d. normal, how likely is it that a naive ad hoc
specification which omits the production function part altogether

$$g_y = \lambda + \gamma g_e + \delta \qquad (13)$$

should fit the data as well as it does? This way of posing the question
seems like a fair way to inquire about the statistical likelihood of a
stable production function relation between aggregate output, capital,
and labor. Observe that (12) and (13) are not nested models in the usual
statistical sense, although they are related to each other in a particularly
simple fashion through a zero-or-one coefficient.² Thus, our basic
question about the role of the aggregate production function in explaining
growth cannot be answered directly within the classical framework.

Note  that  we  have  imposed  no  conditions  on  the  specific  form  of 
the  production  function  F(K,L),  having  used  only  the  assumption  of 
competitive  shares  and  constant  returns  to  scale. 

The data used are an updated and revised version of the
Christensen-Jorgenson series on the aggregate private economy from
the Review of Income and Wealth.³ The primary period investigated
was 1947-1978, although in all cases we have verified the legitimacy
of our results over the longer span 1929-1978.

The ESS for the regression based on (12) is 7.072E-03 (with a DW
of 2.15), whereas the ESS is 7.024E-03 for the regression based on (13).
Thus, omitting capital and labor altogether gives a slightly better fit
than if the production function were included in the growth equation.
More importantly, the difference is highly significant. The probability
that (13) would fit as well as it does if (12) were the true
specification is

$$\alpha = .06 \pm .01$$

(based on 1000 simulations). In this sense, we tend to reject the
production function specification at a high level of significance.
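The reported figures can be checked with a few lines of arithmetic. The sketch below assumes that the two ESS values are the sums of squared residuals entering the test statistic (rejection model over hypothesis model) and applies the binomial variance formula to the reported significance level; the variable names are hypothetical.

```python
import math

ess_h = 7.072e-03   # ESS reported for specification (12), the hypothesis model
ess_r = 7.024e-03   # ESS reported for specification (13), the rejection model
alpha = 0.06        # reported probability
n_sims = 1000       # reported number of simulations

lam_obs = ess_r / ess_h                               # observed test statistic
std_err = math.sqrt(alpha * (1.0 - alpha) / n_sims)   # binomial standard error
print(round(lam_obs, 4), round(std_err, 4))           # prints: 0.9932 0.0075
```

The standard error of roughly 0.0075 is consistent with the reported margin of .01 after rounding.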
Conclusion

Our  guess  is  that  the  rejection  of  so-called  "theoretically 
justified"  economic  models  by  empirical,  ad-hoc  specifications  might 
be  a  pervasive  feature  in  several  (although  by  no  means  all)  areas  of 
applied  econometric  work. 

Often  what  we  deal  with  in  economics  is  essentially  a  collection 
of  time  series,  somewhat  erratic  but  on  the  whole  growing  together.   It 
is  tempting  to  read  into  this  mild  chaos  some  order,  and  an  economist 
naturally  tries  to  impose  order  by  fitting  an  economic  model  —  for 
example  a  production  function.   But  we  fear  the  sad  truth  is  that  many 
of  our  "theoretically  justified"  models  may  not  be  fitting  the  data 
much  better  than  clever  ad  hoc  specifications.   And  this  would  mean, 
in  a  sense  we  have  tried  to  make  rigorous,  that  the  "theoretically 
justified"  model  is  suspect  on  empirical  grounds,  however  strong  the 
desire  or  motivation  to  believe  it  is  true. 

Failing  our  test  does  not  necessarily  mean  the  hypothesis  model 
must  be  abandoned.   But  it  can  signal  us  when  to  be  more  modest  in  our 
claims, more tentative in our applications, and more alert for
alternative explanations.

We  hope  the  exact  test  we  have  proposed  can  be  constructively  used 
to sift out those models where there exists an empirically viable
structure from other situations where our own wishful thinking is making us
impose  a  theoretical  grouping  that  perhaps  nature  does  not  intend. 


FOOTNOTES 


1
We checked these results empirically by performing several nested
tests, and comparing the values of $\alpha$ obtained from a t or F distribution
with our simulations:

Form of test                  $\alpha$ from t or F dist.    $\alpha$ from simulations
$\beta_1 = 0$ (t)             0.640                         0.625
$\beta_2 = 0$ (t)             0.776                         0.760
$\beta_1 = \beta_2 = 0$ (F)   0.884                         0.890

The number of simulations in each case was 1,000.

2
Of course models (12) and (13) could be artificially nested by
introducing a free coefficient on the factor growth contribution term.
However, this would not be the same test as we propose. In fact it is
statistically much weaker because of the multicollinearity typically
introduced by artificial nesting. The estimated coefficient is .34
with a standard error of .42.

3 
We  thank  Dale  Jorgenson  for  providing  us  with  this  data. 